Abstract

While there has been a great deal of work on the development of reasoning 
algorithms for expressive description logics, in most cases only Tbox reasoning is 
considered. In this paper we present an algorithm for combined Tbox and Abox 
reasoning in the SHIQ description logic. This algorithm is of particular interest 
as it can be used to decide the problem of (database) conjunctive query containment 
w.r.t. a schema. Moreover, the realisation of an efficient implementation should 
be relatively straightforward as it can be based on an existing highly optimised 
implementation of the Tbox algorithm in the FaCT system. 



1 Motivation 

A description logic (DL) knowledge base (KB) is made up of two parts, a termino- 
logical part (the terminology or Tbox) and an assertional part (the Abox), each part 
consisting of a set of axioms. The Tbox asserts facts about concepts (sets of objects) 
and roles (binary relations), usually in the form of inclusion axioms, while the Abox 
asserts facts about individuals (single objects), usually in the form of instantiation ax- 
ioms. For example, a Tbox might contain an axiom asserting that Man is subsumed by 
Animal, while an Abox might contain axioms asserting that both Aristotle and Plato 
are instances of the concept Man and that the pair (Aristotle, Plato) is an instance of 
the role Pupil-of. 

*This paper will appear in the Proceedings of the 17th International Conference on Automated Deduction
(CADE-17), Lecture Notes in Computer Science, Germany, 2000. Springer Verlag.
For logics that include full negation, all common DL reasoning tasks are reducible
to deciding KB consistency, i.e., determining if a given KB admits a non-empty inter- 
pretation [6]. There has been a great deal of work on the development of reasoning 
algorithms for expressive DLs [2, 12, 16, 11], but in most cases these consider only 
Tbox reasoning (i.e., the Abox is assumed to be empty). With expressive DLs, deter- 
mining consistency of a Tbox can often be reduced to determining the satisfiability of 
a single concept [2, 23, 3], and — as most DLs enjoy the tree model property (i.e., if a 
concept has a model, then it has a tree model) — this problem can be decided using a 
tableau-based decision procedure. 

The relative lack of interest in Abox reasoning can also be explained by the fact that 
many applications only require Tbox reasoning, e.g., ontological engineering [15, 20] 
and schema integration [10]. Of particular interest in this regard is the DL SHIQ [18],
which is powerful enough to encode the logic DLR [10], and which can thus be used
for reasoning about conceptual data models, e.g., Entity-Relationship (ER) schemas [9]. 
Moreover, if we think of the Tbox as a schema and the Abox as (possibly incomplete) 
data, then it seems reasonable to assume that realistic Tboxes will be of limited size, 
whereas realistic Aboxes could be of almost unlimited size. Given the high complex- 
ity of reasoning in most DLs [23, 7], this suggests that Abox reasoning could lead to 
severe tractability problems in realistic applications.¹

However, SHIQ Abox reasoning is of particular interest as it allows DLR schema
reasoning to be extended to reasoning about conjunctive query containment w.r.t. a 
schema [8]. This is achieved by using Abox individuals to represent variables and 
constants in the queries, and to enforce co-references [17]. In this context, the size of 
the Abox would be quite small (it is bounded by the number of variables occurring in 
the queries), and should not lead to severe tractability problems. 

Moreover, an alternative view of the Abox is that it provides a restricted form of 
reasoning with nominals, i.e., allowing individual names to appear in concepts [22, 
5, 1]. Unrestricted nominals are very powerful, allowing arbitrary co-references to 
be enforced and thus leading to the loss of the tree model property. This makes it 
much harder to prove decidability and to devise decision procedures (the decidability 
of SHIQ with unrestricted nominals is still an open problem). An Abox, on the other
hand, can be modelled by a forest, a set of trees whose root nodes form an arbitrarily
connected graph, where the number of trees is limited by the number of individual names
occurring in the Abox. Even the restricted form of co-referencing provided by an
Abox is quite powerful, and can extend the range of applications for the DL's reasoning
services.

In this paper we present a tableaux-based algorithm for deciding the satisfiability of
unrestricted SHIQ KBs (i.e., ones where the Abox may be non-empty) that extends
the existing consistency algorithm for Tboxes [18] by making use of the forest model 
property. This should make the realisation of an efficient implementation relatively 
straightforward as it can be based on an existing highly optimised implementation of 
the Tbox algorithm (e.g., in the FaCT system [14]). A notable feature of the algorithm
is that, instead of making a unique name assumption w.r.t. all individuals (an
assumption commonly made in DLs [4]), increased flexibility is provided by allowing
the Abox to contain axioms explicitly asserting inequalities between pairs of individual
names (adding such an axiom for every pair of individual names is obviously equivalent
to making a unique name assumption).

1 Although suitably optimised algorithms may make reasoning practicable for quite large Aboxes [13].

2 Preliminaries 

In this section, we introduce the DL SHIQ. This includes the definition of syntax,
semantics, inference problems (concept subsumption and satisfiability, Abox consistency,
and all of these problems with respect to terminologies²), and their relationships.

SHIQ is based on an extension of the well known DL ALC [24] to include transitively
closed primitive roles [21]; we call this logic S due to its relationship with
the propositional (multi) modal logic S4(m) [23].³ This basic DL is then extended with
inverse roles (I), role hierarchies (H), and qualifying number restrictions (Q).

Definition 2.1 

Let $\mathbf{C}$ be a set of concept names and $\mathbf{R}$ a set of role names with a subset
$\mathbf{R}_+ \subseteq \mathbf{R}$ of transitive role names. The set of roles is
$\mathbf{R} \cup \{R^- \mid R \in \mathbf{R}\}$. To avoid considering roles such as $R^{--}$, we define
a function $\mathsf{Inv}$ on roles such that $\mathsf{Inv}(R) = R^-$ if $R$ is a role name, and
$\mathsf{Inv}(R) = S$ if $R = S^-$. We also define a function $\mathsf{Trans}$ which returns true iff
$R$ is a transitive role. More precisely, $\mathsf{Trans}(R) = \mathit{true}$ iff $R \in \mathbf{R}_+$ or
$\mathsf{Inv}(R) \in \mathbf{R}_+$.

A role inclusion axiom is an expression of the form $R \sqsubseteq S$, where $R$ and $S$ are
roles, each of which can be inverse. A role hierarchy is a set of role inclusion axioms.
For a role hierarchy $\mathcal{R}$, we define the relation $\sqsubseteq^*$ to be the transitive-reflexive
closure of $\sqsubseteq$ over $\mathcal{R} \cup \{\mathsf{Inv}(R) \sqsubseteq \mathsf{Inv}(S) \mid R \sqsubseteq S \in \mathcal{R}\}$.
A role $R$ is called a sub-role (resp. super-role) of a role $S$ if $R \sqsubseteq^* S$
(resp. $S \sqsubseteq^* R$). A role is simple if it is neither transitive
nor has any transitive sub-roles.
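
The role machinery of Definition 2.1 can be illustrated with a minimal Python sketch. The encoding is an assumption made only for this example: roles are strings, a trailing `-` marks an inverse role, and the function names `inv`, `trans`, `subrole_closure` and `is_simple` as well as the roles `parent` and `ancestor` are hypothetical.

```python
def inv(role: str) -> str:
    """Inv(R): a trailing '-' marks an inverse role, so Inv strips or adds it."""
    return role[:-1] if role.endswith("-") else role + "-"


def trans(role: str, transitive_names: set) -> bool:
    """Trans(R) holds iff R or Inv(R) is a transitive role name."""
    return role in transitive_names or inv(role) in transitive_names


def subrole_closure(role_names: set, inclusions: set) -> set:
    """Reflexive-transitive closure of the inclusions and their inverted copies."""
    roles = role_names | {inv(r) for r in role_names}
    pairs = {(r, r) for r in roles}
    for sub, sup in inclusions:
        pairs.add((sub, sup))
        pairs.add((inv(sub), inv(sup)))
    changed = True
    while changed:  # naive fixpoint, adequate for small hierarchies
        changed = False
        for a, b in list(pairs):
            for c, d in list(pairs):
                if b == c and (a, d) not in pairs:
                    pairs.add((a, d))
                    changed = True
    return pairs


def is_simple(role: str, closure: set, transitive_names: set) -> bool:
    """A role is simple iff none of its sub-roles (itself included) is transitive."""
    return not any(trans(sub, transitive_names)
                   for sub, sup in closure if sup == role)


# Hypothetical hierarchy: parent is a sub-role of the transitive role ancestor.
role_names = {"parent", "ancestor"}
transitive_names = {"ancestor"}
closure = subrole_closure(role_names, {("parent", "ancestor")})
```

In this hypothetical hierarchy `parent` is simple and may appear in number restrictions, whereas `ancestor` and its inverse may not.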

The set of SHIQ-concepts is the smallest set such that

• every concept name is a concept, and,

• if $C$, $D$ are concepts, $R$ is a role, $S$ is a simple role, and $n$ is a nonnegative
integer, then $C \sqcap D$, $C \sqcup D$, $\neg C$, $\forall R.C$, $\exists R.C$, $\geq n S.C$, and $\leq n S.C$ are also
concepts.

A general concept inclusion axiom (GCI) is an expression of the form $C \sqsubseteq D$ for two
SHIQ-concepts $C$ and $D$. A terminology is a set of GCIs.

Let $\mathbf{I} = \{a, b, c, \ldots\}$ be a set of individual names. An assertion is of the form $a : C$,
$(a, b) : R$, or $a \not\doteq b$ for $a, b \in \mathbf{I}$, a (possibly inverse) role $R$, and a SHIQ-concept $C$.
An Abox is a finite set of assertions.

Next, we define semantics of SHIQ and the corresponding inference problems.

2 We use terminologies instead of Tboxes to underline the fact that we allow for general concept inclusions 
axioms and do not disallow cycles. 

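3 The logic S has previously been called $\mathcal{ALC}_{R^+}$, but this becomes too cumbersome when adding letters
to represent additional features.

Definition 2.2

An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consists of a set $\Delta^{\mathcal{I}}$, called the domain of $\mathcal{I}$, and a
valuation $\cdot^{\mathcal{I}}$ which maps every concept to a subset of $\Delta^{\mathcal{I}}$ and every role to a subset of
$\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ such that, for all concepts $C$, $D$, roles $R$, $S$, and non-negative integers $n$,
the following equations are satisfied, where $\sharp M$ denotes the cardinality of a set $M$ and
$(R^{\mathcal{I}})^+$ the transitive closure of $R^{\mathcal{I}}$:

$R^{\mathcal{I}} = (R^{\mathcal{I}})^+$ for each role $R \in \mathbf{R}_+$
$(R^-)^{\mathcal{I}} = \{\langle x, y\rangle \mid \langle y, x\rangle \in R^{\mathcal{I}}\}$ (inverse roles)
$(C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}$ (conjunction)
$(C \sqcup D)^{\mathcal{I}} = C^{\mathcal{I}} \cup D^{\mathcal{I}}$ (disjunction)
$(\neg C)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$ (negation)
$(\exists R.C)^{\mathcal{I}} = \{x \mid \exists y.\ \langle x, y\rangle \in R^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\}$ (exists restriction)
$(\forall R.C)^{\mathcal{I}} = \{x \mid \forall y.\ \langle x, y\rangle \in R^{\mathcal{I}} \text{ implies } y \in C^{\mathcal{I}}\}$ (value restriction)
$(\geq n R.C)^{\mathcal{I}} = \{x \mid \sharp\{y.\ \langle x, y\rangle \in R^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\} \geq n\}$ ($\geq$-number restriction)
$(\leq n R.C)^{\mathcal{I}} = \{x \mid \sharp\{y.\ \langle x, y\rangle \in R^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\} \leq n\}$ ($\leq$-number restriction)

An interpretation $\mathcal{I}$ satisfies a role hierarchy $\mathcal{R}$ iff $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$ for each $R \sqsubseteq S$ in $\mathcal{R}$.
Such an interpretation is called a model of $\mathcal{R}$ (written $\mathcal{I} \models \mathcal{R}$).

An interpretation $\mathcal{I}$ satisfies a terminology $\mathcal{T}$ iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ for each GCI $C \sqsubseteq D$
in $\mathcal{T}$. Such an interpretation is called a model of $\mathcal{T}$ (written $\mathcal{I} \models \mathcal{T}$).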

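A concept $C$ is called satisfiable with respect to a role hierarchy $\mathcal{R}$ and a terminology
$\mathcal{T}$ iff there is a model $\mathcal{I}$ of $\mathcal{R}$ and $\mathcal{T}$ with $C^{\mathcal{I}} \neq \emptyset$. A concept $D$ subsumes a
concept $C$ w.r.t. $\mathcal{R}$ and $\mathcal{T}$ iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ holds for each model $\mathcal{I}$ of $\mathcal{R}$ and $\mathcal{T}$. For an
interpretation $\mathcal{I}$, an element $x \in \Delta^{\mathcal{I}}$ is called an instance of a concept $C$ iff $x \in C^{\mathcal{I}}$.

For Aboxes, an interpretation maps, additionally, each individual $a \in \mathbf{I}$ to some
element $a^{\mathcal{I}} \in \Delta^{\mathcal{I}}$. An interpretation $\mathcal{I}$ satisfies an assertion

$a : C$ iff $a^{\mathcal{I}} \in C^{\mathcal{I}}$,
$(a, b) : R$ iff $\langle a^{\mathcal{I}}, b^{\mathcal{I}}\rangle \in R^{\mathcal{I}}$, and
$a \not\doteq b$ iff $a^{\mathcal{I}} \neq b^{\mathcal{I}}$.

An Abox $\mathcal{A}$ is consistent w.r.t. $\mathcal{R}$ and $\mathcal{T}$ iff there is a model $\mathcal{I}$ of $\mathcal{R}$ and $\mathcal{T}$ that satisfies
each assertion in $\mathcal{A}$.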
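
For a finite interpretation, the semantic equations of Definition 2.2 can be evaluated directly. The following Python sketch is only an illustration under assumed conventions that are not part of the paper: concepts are nested tuples such as `("some", "pupil_of", ("atom", "Man"))`, a trailing `-` marks an inverse role, and the helper names `role_ext` and `ext` are hypothetical.

```python
def role_ext(role, roles):
    """Extension of a possibly inverse role ('R-' denotes the inverse of R)."""
    if role.endswith("-"):
        return {(y, x) for (x, y) in roles.get(role[:-1], set())}
    return set(roles.get(role, set()))


def ext(concept, domain, concepts, roles):
    """Extension of a concept in a finite interpretation, following Definition 2.2."""
    kind = concept[0]
    if kind == "atom":
        return set(concepts.get(concept[1], set()))
    if kind == "not":
        return domain - ext(concept[1], domain, concepts, roles)
    if kind == "and":
        return ext(concept[1], domain, concepts, roles) & ext(concept[2], domain, concepts, roles)
    if kind == "or":
        return ext(concept[1], domain, concepts, roles) | ext(concept[2], domain, concepts, roles)
    # Remaining kinds are restrictions: ("some"|"all", R, C) or ("atleast"|"atmost", n, R, C).
    if kind in ("some", "all"):
        n, role, filler = None, concept[1], concept[2]
    else:
        n, role, filler = concept[1], concept[2], concept[3]
    filler_ext = ext(filler, domain, concepts, roles)
    pairs = role_ext(role, roles)
    result = set()
    for x in domain:
        successors = {y for (x2, y) in pairs if x2 == x}
        hits = successors & filler_ext
        if ((kind == "some" and len(hits) > 0)
                or (kind == "all" and successors <= filler_ext)
                or (kind == "atleast" and len(hits) >= n)
                or (kind == "atmost" and len(hits) <= n)):
            result.add(x)
    return result


# The running example: Aristotle and Plato are instances of Man,
# and (Aristotle, Plato) is an instance of Pupil-of.
domain = {"aristotle", "plato"}
concepts = {"Man": {"aristotle", "plato"}}
roles = {"pupil_of": {("aristotle", "plato")}}
man = ("atom", "Man")
pupils = ext(("some", "pupil_of", man), domain, concepts, roles)
```

An assertion $a : C$ is then satisfied exactly when the element chosen for $a$ lies in the computed extension of $C$.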

For DLs that are closed under negation, subsumption and (un)satisfiability can be mutually
reduced: $C \sqsubseteq D$ iff $C \sqcap \neg D$ is unsatisfiable, and $C$ is unsatisfiable iff $C \sqsubseteq A \sqcap \neg A$
for some concept name $A$. Moreover, a concept $C$ is satisfiable iff the Abox $\{a : C\}$ is
consistent. It is straightforward to extend these reductions to role hierarchies, but
terminologies deserve special care: In [2, 23, 3], the internalisation of GCIs is introduced,
a technique that reduces reasoning w.r.t. a (possibly cyclic) terminology to reasoning
w.r.t. the empty terminology. For SHIQ, this reduction must be slightly modified. The
following Lemma shows how general concept inclusion axioms can be internalised using
a "universal" role $U$, that is, a transitive super-role of all roles occurring in $\mathcal{T}$ and
their respective inverses.

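Lemma 2.3 Let $C$, $D$ be concepts, $\mathcal{A}$ an Abox, $\mathcal{T}$ a terminology, and $\mathcal{R}$ a role
hierarchy. We define

$C_{\mathcal{T}} := \bigsqcap_{C_i \sqsubseteq D_i \in \mathcal{T}} \neg C_i \sqcup D_i.$

Let $U$ be a transitive role that does not occur in $\mathcal{T}$, $C$, $D$, $\mathcal{A}$, or $\mathcal{R}$. We set

$\mathcal{R}_U := \mathcal{R} \cup \{R \sqsubseteq U, \mathsf{Inv}(R) \sqsubseteq U \mid R \text{ occurs in } \mathcal{T}, C, D, \mathcal{A}, \text{ or } \mathcal{R}\}.$

• $C$ is satisfiable w.r.t. $\mathcal{T}$ and $\mathcal{R}$ iff $C \sqcap C_{\mathcal{T}} \sqcap \forall U.C_{\mathcal{T}}$ is satisfiable w.r.t. $\mathcal{R}_U$.

• $D$ subsumes $C$ with respect to $\mathcal{T}$ and $\mathcal{R}$ iff $C \sqcap \neg D \sqcap C_{\mathcal{T}} \sqcap \forall U.C_{\mathcal{T}}$ is unsatisfiable
w.r.t. $\mathcal{R}_U$.

• $\mathcal{A}$ is consistent with respect to $\mathcal{R}$ and $\mathcal{T}$ iff $\mathcal{A} \cup \{a : C_{\mathcal{T}} \sqcap \forall U.C_{\mathcal{T}} \mid a \text{ occurs in } \mathcal{A}\}$
is consistent w.r.t. $\mathcal{R}_U$.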

The proof of Lemma 2.3 is similar to the ones that can be found in [23, 2]. Most
importantly, it must be shown that, (a) if a SHIQ-concept $C$ is satisfiable with respect
to a terminology $\mathcal{T}$ and a role hierarchy $\mathcal{R}$, then $C$, $\mathcal{T}$ have a connected model, i.e., a
model where any two elements are connected by a role path over those roles occurring in $C$
and $\mathcal{T}$, and (b) if $y$ is reachable from $x$ via a role path (possibly involving inverse roles),
then $\langle x, y\rangle \in U^{\mathcal{I}}$. These are easy consequences of the semantics and the definition of
$U$.
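
The internalisation step of Lemma 2.3 can be sketched as follows. This is a minimal illustration, assuming concepts are encoded as nested tuples and GCIs as pairs `(C, D)`; the function name `internalise`, the role name `"U"` and the example GCI are hypothetical, and the sketch assumes a non-empty terminology because the syntax above has no top concept.

```python
from functools import reduce


def inv(role: str) -> str:
    """Inv(R) for roles encoded as strings with a trailing '-' for inverses."""
    return role[:-1] if role.endswith("-") else role + "-"


def internalise(concept, tbox, roles, universal="U"):
    """Return the internalised concept and the inclusions that extend the role hierarchy.

    tbox  : non-empty list of GCIs, each a pair (C, D) read as "C is subsumed by D"
    roles : role names occurring in the input (the caller is assumed to collect them)
    """
    if not tbox:
        raise ValueError("this sketch assumes at least one GCI")
    if universal in roles or inv(universal) in roles:
        raise ValueError("the universal role must be fresh")
    # C_T is the conjunction of (not C_i or D_i) over all GCIs.
    disjuncts = [("or", ("not", c), d) for c, d in tbox]
    c_t = reduce(lambda acc, d: ("and", acc, d), disjuncts)
    # Every role and its inverse becomes a sub-role of the transitive role U.
    new_inclusions = set()
    for r in roles:
        new_inclusions.add((r, universal))
        new_inclusions.add((inv(r), universal))
    return ("and", concept, ("and", c_t, ("all", universal, c_t))), new_inclusions


man, animal = ("atom", "Man"), ("atom", "Animal")
internalised, extra = internalise(man, [(man, animal)], {"pupil_of"})
```

The caller would still have to declare `U` transitive and add `extra` to the role hierarchy before testing satisfiability of `internalised`.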

Theorem 2.4 

Satisfiability and subsumption of SHIQ-concepts w.r.t. terminologies and role hierarchies
are polynomially reducible to (un)satisfiability of SHIQ-concepts w.r.t. role
hierarchies, and therefore to consistency of SHIQ-Aboxes w.r.t. role hierarchies.

Consistency of SHIQ-Aboxes w.r.t. terminologies and role hierarchies is polynomially
reducible to consistency of SHIQ-Aboxes w.r.t. role hierarchies.

3 A SHIQ-Abox Tableau Algorithm 

With Theorem 2.4, all standard inference problems for SHIQ-concepts and Aboxes
can be reduced to Abox-consistency w.r.t. a role hierarchy. In the following, we present
a tableau-based algorithm that decides consistency of SHIQ-Aboxes w.r.t. role hierarchies,
and therefore all other SHIQ inference problems presented.

The algorithm tries to construct, for a SHIQ-Abox $\mathcal{A}$, a tableau for $\mathcal{A}$, that is, an
abstraction of a model of $\mathcal{A}$. Given the notion of a tableau, it is then quite straightforward
to prove that the algorithm is a decision procedure for Abox consistency.

3.1 A Tableau for Aboxes 

In the following, if not stated otherwise, $C$, $D$ denote SHIQ-concepts, $\mathcal{R}$ a role hierarchy,
$\mathcal{A}$ an Abox, $\mathbf{R}_{\mathcal{A}}$ the set of roles occurring in $\mathcal{A}$ and $\mathcal{R}$ together with their inverses,
and $\mathbf{I}_{\mathcal{A}}$ the set of individuals occurring in $\mathcal{A}$.

Without loss of generality, we assume all concepts $C$ occurring in assertions $a : C \in \mathcal{A}$
to be in NNF, that is, negation occurs in front of concept names only. Any SHIQ-concept
can easily be transformed into an equivalent one in NNF by pushing negations
inwards using a combination of DeMorgan's laws and the following equivalences:

$\neg(\exists R.C) \equiv (\forall R.\neg C)$        $\neg(\forall R.C) \equiv (\exists R.\neg C)$

$\neg(\leq n R.C) \equiv\ \geq (n+1) R.C$        $\neg(\geq n R.C) \equiv\ \leq (n-1) R.C$, where

$\leq (-1) R.C := A \sqcap \neg A$ for some $A \in \mathbf{C}$

For a concept $C$ we will denote the NNF of $\neg C$ by ${\sim}C$. Next, for a concept $C$,
$\mathsf{clos}(C)$ is the smallest set that contains $C$ and is closed under sub-concepts and ${\sim}$. We
use $\mathsf{clos}(\mathcal{A}) := \bigcup_{a : C \in \mathcal{A}} \mathsf{clos}(C)$ for the closure $\mathsf{clos}(C)$ of each concept $C$ occurring
in $\mathcal{A}$. It is not hard to show that the size of $\mathsf{clos}(\mathcal{A})$ is polynomial in the size of $\mathcal{A}$.
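
The NNF transformation and the closure can be sketched in Python as below. The tuple encoding of concepts, the function names `nnf` and `clos`, and the choice of the concept name `"A"` for the unsatisfiable concept are assumptions made for this illustration only.

```python
def nnf(c, bottom_atom="A"):
    """Negation normal form: push negations inwards until they sit on concept names."""
    kind = c[0]
    if kind == "atom":
        return c
    if kind in ("and", "or"):
        return (kind, nnf(c[1], bottom_atom), nnf(c[2], bottom_atom))
    if kind in ("some", "all"):
        return (kind, c[1], nnf(c[2], bottom_atom))
    if kind in ("atleast", "atmost"):
        return (kind, c[1], c[2], nnf(c[3], bottom_atom))
    # kind == "not": dispatch on the negated concept
    d = c[1]
    k = d[0]
    if k == "atom":
        return c
    if k == "not":
        return nnf(d[1], bottom_atom)
    if k == "and":  # De Morgan
        return ("or", nnf(("not", d[1]), bottom_atom), nnf(("not", d[2]), bottom_atom))
    if k == "or":   # De Morgan
        return ("and", nnf(("not", d[1]), bottom_atom), nnf(("not", d[2]), bottom_atom))
    if k == "some":
        return ("all", d[1], nnf(("not", d[2]), bottom_atom))
    if k == "all":
        return ("some", d[1], nnf(("not", d[2]), bottom_atom))
    if k == "atmost":
        return ("atleast", d[1] + 1, d[2], nnf(d[3], bottom_atom))
    # k == "atleast"
    if d[1] == 0:   # "at most -1" is read as the unsatisfiable concept A and not A
        a = ("atom", bottom_atom)
        return ("and", a, ("not", a))
    return ("atmost", d[1] - 1, d[2], nnf(d[3], bottom_atom))


def clos(c):
    """Smallest set containing nnf(c), closed under sub-concepts and NNF negation."""
    todo, seen = [nnf(c)], set()
    while todo:
        d = todo.pop()
        if d in seen:
            continue
        seen.add(d)
        todo.append(nnf(("not", d)))  # closure under ~
        kind = d[0]
        if kind == "not":
            todo.append(d[1])
        elif kind in ("and", "or"):
            todo.extend(d[1:3])
        elif kind in ("some", "all"):
            todo.append(d[2])
        elif kind in ("atleast", "atmost"):
            todo.append(d[3])
    return seen


a, b = ("atom", "A"), ("atom", "B")
example = nnf(("not", ("some", "r", ("and", a, b))))
```

Here `example` is the tuple form of $\forall r.(\neg A \sqcup \neg B)$.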

Definition 3.1 

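$T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ is a tableau for $\mathcal{A}$ w.r.t. $\mathcal{R}$ iff

• $\mathbf{S}$ is a non-empty set,

• $\mathcal{L} : \mathbf{S} \rightarrow 2^{\mathsf{clos}(\mathcal{A})}$ maps each element in $\mathbf{S}$ to a set of concepts,

• $\mathcal{E} : \mathbf{R}_{\mathcal{A}} \rightarrow 2^{\mathbf{S} \times \mathbf{S}}$ maps each role to a set of pairs of elements in $\mathbf{S}$, and

• $\mathcal{J} : \mathbf{I}_{\mathcal{A}} \rightarrow \mathbf{S}$ maps individuals occurring in $\mathcal{A}$ to elements in $\mathbf{S}$.

Furthermore, for all $s, t \in \mathbf{S}$, $C, C_1, C_2 \in \mathsf{clos}(\mathcal{A})$, and $R, S \in \mathbf{R}_{\mathcal{A}}$, $T$ satisfies:

(P1) if $C \in \mathcal{L}(s)$, then $\neg C \notin \mathcal{L}(s)$,
(P2) if $C_1 \sqcap C_2 \in \mathcal{L}(s)$, then $C_1 \in \mathcal{L}(s)$ and $C_2 \in \mathcal{L}(s)$,
(P3) if $C_1 \sqcup C_2 \in \mathcal{L}(s)$, then $C_1 \in \mathcal{L}(s)$ or $C_2 \in \mathcal{L}(s)$,
(P4) if $\forall S.C \in \mathcal{L}(s)$ and $\langle s, t\rangle \in \mathcal{E}(S)$, then $C \in \mathcal{L}(t)$,
(P5) if $\exists S.C \in \mathcal{L}(s)$, then there is some $t \in \mathbf{S}$ such that $\langle s, t\rangle \in \mathcal{E}(S)$ and $C \in \mathcal{L}(t)$,
(P6) if $\forall S.C \in \mathcal{L}(s)$ and $\langle s, t\rangle \in \mathcal{E}(R)$ for some $R \sqsubseteq^* S$ with $\mathsf{Trans}(R)$, then $\forall R.C \in \mathcal{L}(t)$,
(P7) $\langle x, y\rangle \in \mathcal{E}(R)$ iff $\langle y, x\rangle \in \mathcal{E}(\mathsf{Inv}(R))$,
(P8) if $\langle s, t\rangle \in \mathcal{E}(R)$ and $R \sqsubseteq^* S$, then $\langle s, t\rangle \in \mathcal{E}(S)$,
(P9) if $\leq n S.C \in \mathcal{L}(s)$, then $\sharp S^T(s, C) \leq n$,
(P10) if $\geq n S.C \in \mathcal{L}(s)$, then $\sharp S^T(s, C) \geq n$,
(P11) if $(\bowtie n\, S\, C) \in \mathcal{L}(s)$ and $\langle s, t\rangle \in \mathcal{E}(S)$ then $C \in \mathcal{L}(t)$ or ${\sim}C \in \mathcal{L}(t)$,
(P12) if $a : C \in \mathcal{A}$, then $C \in \mathcal{L}(\mathcal{J}(a))$,
(P13) if $(a, b) : R \in \mathcal{A}$ then $\langle \mathcal{J}(a), \mathcal{J}(b)\rangle \in \mathcal{E}(R)$,
(P14) if $a \not\doteq b \in \mathcal{A}$, then $\mathcal{J}(a) \neq \mathcal{J}(b)$,

where $\bowtie$ is a place-holder for both $\leq$ and $\geq$, and $S^T(s, C) := \{t \in \mathbf{S} \mid \langle s, t\rangle \in \mathcal{E}(S) \text{ and } C \in \mathcal{L}(t)\}$.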

Lemma 3.2 A SHIQ-Abox $\mathcal{A}$ is consistent w.r.t. $\mathcal{R}$ iff there exists a tableau for $\mathcal{A}$
w.r.t. $\mathcal{R}$.






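Proof: For the if direction, if $T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ is a tableau for $\mathcal{A}$ w.r.t. $\mathcal{R}$, a model
$\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ of $\mathcal{A}$ and $\mathcal{R}$ can be defined as follows:

$\Delta^{\mathcal{I}} := \mathbf{S}$
for concept names $A$ in $\mathsf{clos}(\mathcal{A})$: $A^{\mathcal{I}} := \{s \mid A \in \mathcal{L}(s)\}$
for individual names $a \in \mathbf{I}$: $a^{\mathcal{I}} := \mathcal{J}(a)$
for role names $R \in \mathbf{R}$: $R^{\mathcal{I}} := \mathcal{E}(R)^+$ if $\mathsf{Trans}(R)$, and
    $R^{\mathcal{I}} := \mathcal{E}(R) \cup \bigcup_{P \sqsubseteq^* R,\ P \neq R} P^{\mathcal{I}}$ otherwise

where $\mathcal{E}(R)^+$ denotes the transitive closure of $\mathcal{E}(R)$. The interpretation of non-transitive
roles is recursive in order to correctly interpret those non-transitive roles that have a
transitive sub-role. From the definition of $R^{\mathcal{I}}$ and (P8), it follows that, if $\langle s, t\rangle \in S^{\mathcal{I}}$,
then either $\langle s, t\rangle \in \mathcal{E}(S)$ or there exists a path $\langle s, s_1\rangle, \langle s_1, s_2\rangle, \ldots, \langle s_n, t\rangle \in \mathcal{E}(R)$ for
some $R$ with $\mathsf{Trans}(R)$ and $R \sqsubseteq^* S$.

Due to (P8) and by definition of $\mathcal{I}$, we have that $\mathcal{I}$ is a model of $\mathcal{R}$.
To prove that $\mathcal{I}$ is a model of $\mathcal{A}$, we show that $C \in \mathcal{L}(s)$ implies $s \in C^{\mathcal{I}}$ for any
$s \in \mathbf{S}$. Together with (P12), (P13), and the interpretation of individuals and roles, this
implies that $\mathcal{I}$ satisfies each assertion in $\mathcal{A}$. This proof can be given by induction on
the length $\|C\|$ of a concept $C$ in NNF, where we count neither negation nor integers in
number restrictions. The only interesting case is $C = \forall S.E$: let $t \in \mathbf{S}$ with $\langle s, t\rangle \in S^{\mathcal{I}}$.
There are two possibilities:

• $\langle s, t\rangle \in \mathcal{E}(S)$. Then (P4) implies $E \in \mathcal{L}(t)$.

• $\langle s, t\rangle \notin \mathcal{E}(S)$. Then there exists a path $\langle s, s_1\rangle, \langle s_1, s_2\rangle, \ldots, \langle s_n, t\rangle \in \mathcal{E}(R)$ for
some $R$ with $\mathsf{Trans}(R)$ and $R \sqsubseteq^* S$. Then (P6) implies $\forall R.E \in \mathcal{L}(s_i)$ for all
$1 \leq i \leq n$, and (P4) implies $E \in \mathcal{L}(t)$.

In both cases, $t \in E^{\mathcal{I}}$ by induction and hence $s \in C^{\mathcal{I}}$.

For the converse, for $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ a model of $\mathcal{A}$ w.r.t. $\mathcal{R}$, we define a tableau
$T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ for $\mathcal{A}$ and $\mathcal{R}$ as follows:

$\mathbf{S} := \Delta^{\mathcal{I}}$, $\mathcal{E}(R) := R^{\mathcal{I}}$, $\mathcal{L}(s) := \{C \in \mathsf{clos}(\mathcal{A}) \mid s \in C^{\mathcal{I}}\}$, and $\mathcal{J}(a) = a^{\mathcal{I}}$.
It is easy to demonstrate that $T$ is a tableau for $\mathcal{A}$ w.r.t. $\mathcal{R}$. ∎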
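
The recursive interpretation of roles used in the if direction can be sketched as follows. The dictionary-based encoding and the names `transitive_closure` and `interpret_roles` are assumptions of this illustration; the sketch further assumes that the caller supplies, for each role, its proper sub-roles, and that the sub-role relation between distinct roles is acyclic so that the recursion terminates.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def interpret_roles(edges, sub_roles, transitive):
    """Build the role interpretations from the tableau relation.

    edges      : role -> set of pairs, the tableau's relation for that role
    sub_roles  : role -> set of proper sub-roles (assumed precomputed and acyclic)
    transitive : set of roles for which Trans holds
    """
    memo = {}

    def interp(role):
        if role not in memo:
            if role in transitive:
                memo[role] = transitive_closure(edges.get(role, set()))
            else:
                # Non-transitive roles inherit the (closed) pairs of their sub-roles.
                result = set(edges.get(role, set()))
                for sub in sub_roles.get(role, set()):
                    result |= interp(sub)
                memo[role] = result
        return memo[role]

    return {role: interp(role) for role in edges}


# Hypothetical tableau fragment: R is transitive and a sub-role of S.
edges = {"R": {(1, 2), (2, 3)}, "S": {(1, 2), (2, 3)}}
model = interpret_roles(edges, {"S": {"R"}}, {"R"})
```

In this fragment the pair `(1, 3)` enters the interpretation of `S` only through its transitive sub-role `R`, which is exactly the situation the recursive definition is meant to cover.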

3.2 The Tableau Algorithm 

In this section, we present a completion algorithm that tries to construct, for an input
Abox $\mathcal{A}$ and a role hierarchy $\mathcal{R}$, a tableau for $\mathcal{A}$ w.r.t. $\mathcal{R}$. We prove that this algorithm
constructs a tableau for $\mathcal{A}$ and $\mathcal{R}$ iff there exists a tableau for $\mathcal{A}$ and $\mathcal{R}$, and thus decides
consistency of SHIQ Aboxes w.r.t. role hierarchies.

Since Aboxes might involve several individuals with arbitrary role relationships
between them, the completion algorithm works on a forest rather than on a tree, which
is the basic data structure for those completion algorithms deciding satisfiability of
a concept. Such a forest is a collection of trees whose root nodes correspond to the
individuals present in the input Abox. In the presence of transitive roles, blocking is
employed to ensure termination of the algorithm. In the additional presence of inverse
roles, blocking is dynamic, i.e., blocked nodes (and their sub-branches) can be
unblocked and blocked again later. In the additional presence of number restrictions,
pairs of nodes are blocked rather than single nodes.

Definition 3.3 

A completion forest $\mathcal{F}$ for a SHIQ Abox $\mathcal{A}$ is a collection of trees whose distinguished
root nodes are possibly connected by edges in an arbitrary way. Moreover,
each node $x$ is labelled with a set $\mathcal{L}(x) \subseteq \mathsf{clos}(\mathcal{A})$ and each edge $\langle x, y\rangle$ is labelled with
a set $\mathcal{L}(\langle x, y\rangle) \subseteq \mathbf{R}_{\mathcal{A}}$ of (possibly inverse) roles occurring in $\mathcal{A}$. Finally, completion
forests come with an explicit inequality relation $\not\doteq$ on nodes and an explicit equality
relation $\doteq$ which are implicitly assumed to be symmetric.

If nodes $x$ and $y$ are connected by an edge $\langle x, y\rangle$ with $R \in \mathcal{L}(\langle x, y\rangle)$ and $R \sqsubseteq^* S$,
then $y$ is called an $S$-successor of $x$ and $x$ is called an $\mathsf{Inv}(S)$-predecessor of $y$. If $y$ is
an $S$-successor or an $\mathsf{Inv}(S)$-predecessor of $x$, then $y$ is called an $S$-neighbour of $x$. A
node $y$ is a successor (resp. predecessor or neighbour) of $x$ if it is an $S$-successor (resp.
$S$-predecessor or $S$-neighbour) of $x$ for some role $S$. Finally, ancestor is the transitive
closure of predecessor.

For a role $S$, a concept $C$ and a node $x$ in $\mathcal{F}$ we define $S^{\mathcal{F}}(x, C)$ by

$S^{\mathcal{F}}(x, C) := \{y \mid y \text{ is an } S\text{-neighbour of } x \text{ and } C \in \mathcal{L}(y)\}.$

A node is blocked iff it is not a root node and it is either directly or indirectly
blocked. A node $x$ is directly blocked iff none of its ancestors are blocked, and it has
ancestors $x'$, $y$ and $y'$ such that

1. $y$ is not a root node and

2. $x$ is a successor of $x'$ and $y$ is a successor of $y'$ and

3. $\mathcal{L}(x) = \mathcal{L}(y)$ and $\mathcal{L}(x') = \mathcal{L}(y')$ and

4. $\mathcal{L}(\langle x', x\rangle) = \mathcal{L}(\langle y', y\rangle)$.

In this case we will say that $y$ blocks $x$.

A node $y$ is indirectly blocked iff one of its ancestors is blocked, or it is a successor
of a node $x$ and $\mathcal{L}(\langle x, y\rangle) = \emptyset$; the latter condition avoids wasted expansions after an
application of the $\leq$-rule.
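
The pair-wise direct blocking condition can be sketched in Python as follows. The data layout is an assumption of this illustration: `parent` maps each node to its predecessor in the tree (root nodes map to `None`), and `label` and `edge` hold node and edge labels as frozensets. The name `blocking_witness` is hypothetical, and the sketch checks only conditions 1 to 4; it does not verify that no ancestor of `x` is itself blocked.

```python
def ancestors(x, parent):
    """Ancestors of x, nearest first, following tree predecessors."""
    out = []
    while parent.get(x) is not None:
        x = parent[x]
        out.append(x)
    return out


def blocking_witness(x, parent, label, edge):
    """Return an ancestor y that blocks x under pair-wise blocking, or None."""
    x_pred = parent.get(x)
    if x_pred is None:
        return None                      # root nodes are never blocked
    for y in ancestors(x, parent):
        y_pred = parent.get(y)
        if y_pred is None:
            continue                     # condition 1: y must not be a root node
        if (label[x] == label[y]                                # condition 3
                and label[x_pred] == label[y_pred]              # condition 3
                and edge[(x_pred, x)] == edge[(y_pred, y)]):    # condition 4
            return y
    return None


# Hypothetical chain r -> a -> b -> c -> d with alternating labels.
parent = {"r": None, "a": "r", "b": "a", "c": "b", "d": "c"}
label = {"r": frozenset(), "a": frozenset({"C"}), "b": frozenset({"D"}),
         "c": frozenset({"C"}), "d": frozenset({"D"})}
edge = {(p, n): frozenset({"R"}) for n, p in parent.items() if p is not None}
witness = blocking_witness("d", parent, label, edge)
```

In this chain, node `d` repeats the pattern formed by `b` and its predecessor `a`, so `b` is returned as the blocking node, while `c` is not blocked because the label of the root differs from that of `b`.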

Given a SHIQ-Abox $\mathcal{A}$ and a role hierarchy $\mathcal{R}$, the algorithm initialises a completion
forest $\mathcal{F}_{\mathcal{A}}$ consisting only of root nodes. More precisely, $\mathcal{F}_{\mathcal{A}}$ contains a root node
$x_0^i$ for each individual $a_i \in \mathbf{I}_{\mathcal{A}}$ occurring in $\mathcal{A}$, and an edge $\langle x_0^i, x_0^j\rangle$ if $\mathcal{A}$ contains an
assertion $(a_i, a_j) : R$ for some $R$. The labels of these nodes and edges and the relations
$\not\doteq$ and $\doteq$ are initialised as follows:

$\mathcal{L}(x_0^i) := \{C \mid a_i : C \in \mathcal{A}\}$,
$\mathcal{L}(\langle x_0^i, x_0^j\rangle) := \{R \mid (a_i, a_j) : R \in \mathcal{A}\}$,
$x_0^i \not\doteq x_0^j$ iff $a_i \not\doteq a_j \in \mathcal{A}$, and
the $\doteq$-relation is initialised to be empty. $\mathcal{F}_{\mathcal{A}}$ is then expanded by repeatedly applying
the rules from Figure 1.
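
The initialisation can be sketched as below, assuming an Abox given as a list of tagged tuples. The tags `"inst"`, `"rel"` and `"neq"`, the function name `init_forest`, and the use of individual names directly as root nodes are assumptions of this illustration.

```python
def init_forest(assertions):
    """Build the initial completion forest: one root node per individual.

    assertions: iterable of ("inst", a, C), ("rel", a, b, R) or ("neq", a, b)
    Returns (node_label, edge_label, neq, eq).
    """
    node_label, edge_label, neq, eq = {}, {}, set(), set()
    for assertion in assertions:
        tag = assertion[0]
        if tag == "inst":
            _, a, concept = assertion
            node_label.setdefault(a, set()).add(concept)
        elif tag == "rel":
            _, a, b, role = assertion
            node_label.setdefault(a, set())
            node_label.setdefault(b, set())
            edge_label.setdefault((a, b), set()).add(role)
        elif tag == "neq":
            _, a, b = assertion
            node_label.setdefault(a, set())
            node_label.setdefault(b, set())
            neq.add(frozenset((a, b)))   # unordered pair: the relation is symmetric
        else:
            raise ValueError(f"unknown assertion tag: {tag!r}")
    return node_label, edge_label, neq, eq   # eq starts out empty


man = ("atom", "Man")
abox = [("inst", "aristotle", man), ("inst", "plato", man),
        ("rel", "aristotle", "plato", "pupil_of"),
        ("neq", "aristotle", "plato")]
node_label, edge_label, neq, eq = init_forest(abox)
```

The inequality assertion in `abox` is an addition for the example; the running example in Section 1 contains only the instance and role assertions.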

For a node $x$, $\mathcal{L}(x)$ is said to contain a clash if, for some concept name $A \in \mathbf{C}$,
$\{A, \neg A\} \subseteq \mathcal{L}(x)$, or if there is some concept $\leq n S.C \in \mathcal{L}(x)$ and $x$ has $n + 1$
$S$-neighbours $y_0, \ldots, y_n$ with $C \in \mathcal{L}(y_i)$ and $y_i \not\doteq y_j$ for all $0 \leq i < j \leq n$. A
completion forest is clash-free if none of its nodes contains a clash, and it is complete
if no rule from Figure 1 can be applied to it.
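
The two clash conditions can be checked with a short Python sketch. The tuple encoding of concepts, the function name `has_clash`, and the callback `s_neighbours(x, role)` that is assumed to return the $S$-neighbours of a node are all hypothetical; inequalities are stored as unordered pairs.

```python
from itertools import combinations


def has_clash(x, node_label, s_neighbours, neq):
    """Check the two clash conditions for node x.

    s_neighbours : assumed callback (node, role) -> iterable of S-neighbours
    neq          : set of frozenset pairs recording asserted inequalities
    """
    label = node_label[x]
    # Condition 1: a concept name together with its negation.
    for c in label:
        if c[0] == "atom" and ("not", c) in label:
            return True
    # Condition 2: an at-most restriction with n + 1 pairwise distinct witnesses.
    for c in label:
        if c[0] != "atmost":
            continue
        n, role, filler = c[1], c[2], c[3]
        candidates = [y for y in s_neighbours(x, role) if filler in node_label[y]]
        for group in combinations(candidates, n + 1):
            if all(frozenset(pair) in neq for pair in combinations(group, 2)):
                return True
    return False


c = ("atom", "C")
node_label = {"x": {("atmost", 1, "R", c)}, "y1": {c}, "y2": {c}}


def neighbours(node, role):
    """Hypothetical neighbour lookup for the tiny example forest."""
    return ["y1", "y2"] if (node, role) == ("x", "R") else []
```

With no inequality recorded between `y1` and `y2` the node `x` is clash-free, since the two neighbours may still be identified; once `y1` and `y2` are asserted to be unequal, the at-most restriction produces a clash.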

For a SHIQ-Abox $\mathcal{A}$, the algorithm starts with the completion forest $\mathcal{F}_{\mathcal{A}}$. It applies
the expansion rules in Figure 1, stopping when a clash occurs, and answers "$\mathcal{A}$
is consistent w.r.t. $\mathcal{R}$" iff the completion rules can be applied in such a way that they
yield a complete and clash-free completion forest, and "$\mathcal{A}$ is inconsistent w.r.t. $\mathcal{R}$"
otherwise.

Since both the $\leq$-rule and the $\leq_r$-rule are rather complicated, they deserve some
more explanation. Both rules deal with the situation where a concept $\leq n R.C \in \mathcal{L}(x)$
requires the identification of two $R$-neighbours $y$, $z$ of $x$ that contain $C$ in their labels.
Of course, $y$ and $z$ may only be identified if $y \not\doteq z$ is not asserted. If these conditions
are met, then one of the two rules can be applied. The $\leq$-rule deals with the case where
at least one of the nodes to be identified, namely $y$, is not a root node, and this can lead
to one of two possible situations, both shown in Figure 2. The upper situation occurs
when both $y$ and $z$ are successors of $x$. In this case, we add the label of $y$ to that of
$z$, and the label of the edge $\langle x, y\rangle$ to the label of the edge $\langle x, z\rangle$. Finally, $z$ inherits all
inequalities from $y$, and $\mathcal{L}(\langle x, y\rangle)$ is set to $\emptyset$, thus blocking $y$ and all its successors.

The second situation occurs when both $y$ and $z$ are neighbours of $x$, but $z$ is the
predecessor of $x$. Again, $\mathcal{L}(y)$ is added to $\mathcal{L}(z)$, but in this case the inverse of $\mathcal{L}(\langle x, y\rangle)$
is added to $\mathcal{L}(\langle z, x\rangle)$, because the edge $\langle x, y\rangle$ was pointing away from $x$ while $\langle z, x\rangle$
points towards it. Again, $z$ inherits the inequalities from $y$ and $\mathcal{L}(\langle x, y\rangle)$ is set to $\emptyset$.

The $\leq_r$-rule handles the identification of two root nodes. An example of the whole
procedure is given in the lower part of Figure 2. In this case, special care has to be taken
to preserve the relations introduced into the completion forest due to role assertions in
the Abox, and to memorise the identification of root nodes (this will be needed in order
to construct a tableau from a complete and clash-free completion forest). The $\leq_r$-rule
includes some additional steps that deal with these issues. Firstly, as well as adding
$\mathcal{L}(y)$ to $\mathcal{L}(z)$, the edges (and their respective labels) between $y$ and its neighbours
are also added to $z$. Secondly, $\mathcal{L}(y)$ and all edges going from/to $y$ are removed from
the forest. This will not lead to dangling trees, because all neighbours of $y$ became
neighbours of $z$ in the previous step. Finally, the identification of $y$ and $z$ is recorded
in the $\doteq$ relation.
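
The two situations handled by the $\leq$-rule (identifying a non-root node $y$ with a node $z$) can be sketched as a single merge step. The dictionary layout and the names `inv` and `merge_non_root` are assumptions of this illustration, and the sketch performs only the "then" part of the rule; the caller is assumed to have checked the rule's preconditions.

```python
def inv(role: str) -> str:
    """Inv(R) for roles encoded as strings with a trailing '-' for inverses."""
    return role[:-1] if role.endswith("-") else role + "-"


def merge_non_root(x, y, z, node_label, edge_label, neq, z_is_predecessor=False):
    """Identify y with z, where both are neighbours of x and y is a successor of x.

    node_label : node -> set of concepts
    edge_label : (source, target) -> set of roles
    neq        : set of frozenset pairs recording asserted inequalities
    """
    # 1. z receives the label of y.
    node_label[z] |= node_label[y]
    # 2. The roles on the edge to y move to the edge between x and z.
    if z_is_predecessor:
        # The edge runs from z to x, so the roles have to be inverted.
        edge_label[(z, x)] |= {inv(r) for r in edge_label[(x, y)]}
    else:
        edge_label[(x, z)] |= edge_label[(x, y)]
    # 3. The emptied edge label makes y (and its subtree) indirectly blocked.
    edge_label[(x, y)] = set()
    # 4. z inherits every inequality recorded for y.
    for pair in list(neq):
        if y in pair:
            for u in pair - {y}:
                if u != z:
                    neq.add(frozenset((u, z)))
```

Note that `y` itself is not deleted, in line with the observation that the expansion rules never remove nodes from the forest.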

$\sqcap$-rule:
  if   1. $C_1 \sqcap C_2 \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and
       2. $\{C_1, C_2\} \not\subseteq \mathcal{L}(x)$
  then $\mathcal{L}(x) \rightarrow \mathcal{L}(x) \cup \{C_1, C_2\}$

$\sqcup$-rule:
  if   1. $C_1 \sqcup C_2 \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and
       2. $\{C_1, C_2\} \cap \mathcal{L}(x) = \emptyset$
  then $\mathcal{L}(x) \rightarrow \mathcal{L}(x) \cup \{E\}$ for some $E \in \{C_1, C_2\}$

$\exists$-rule:
  if   1. $\exists S.C \in \mathcal{L}(x)$, $x$ is not blocked, and
       2. $x$ has no $S$-neighbour $y$ with $C \in \mathcal{L}(y)$
  then create a new node $y$ with $\mathcal{L}(\langle x, y\rangle) := \{S\}$ and $\mathcal{L}(y) := \{C\}$

$\forall$-rule:
  if   1. $\forall S.C \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and
       2. there is an $S$-neighbour $y$ of $x$ with $C \notin \mathcal{L}(y)$
  then $\mathcal{L}(y) \rightarrow \mathcal{L}(y) \cup \{C\}$

$\forall_+$-rule:
  if   1. $\forall S.C \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and
       2. there is some $R$ with $\mathsf{Trans}(R)$ and $R \sqsubseteq^* S$,
       3. there is an $R$-neighbour $y$ of $x$ with $\forall R.C \notin \mathcal{L}(y)$
  then $\mathcal{L}(y) \rightarrow \mathcal{L}(y) \cup \{\forall R.C\}$

choose-rule:
  if   1. $(\bowtie n\, S\, C) \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and
       2. there is an $S$-neighbour $y$ of $x$ with $\{C, {\sim}C\} \cap \mathcal{L}(y) = \emptyset$
  then $\mathcal{L}(y) \rightarrow \mathcal{L}(y) \cup \{E\}$ for some $E \in \{C, {\sim}C\}$

$\geq$-rule:
  if   1. $\geq n S.C \in \mathcal{L}(x)$, $x$ is not blocked, and
       2. there are no $n$ $S$-neighbours $y_1, \ldots, y_n$ such that $C \in \mathcal{L}(y_i)$
          and $y_i \not\doteq y_j$ for $1 \leq i < j \leq n$
  then create $n$ new nodes $y_1, \ldots, y_n$ with $\mathcal{L}(\langle x, y_i\rangle) = \{S\}$,
       $\mathcal{L}(y_i) = \{C\}$, and $y_i \not\doteq y_j$ for $1 \leq i < j \leq n$.

$\leq$-rule:
  if   1. $\leq n S.C \in \mathcal{L}(x)$, $x$ is not indirectly blocked, and
       2. $\sharp S^{\mathcal{F}}(x, C) > n$, there are $S$-neighbours $y$, $z$ of $x$ with not $y \not\doteq z$,
          $y$ is neither a root node nor an ancestor of $z$, and $C \in \mathcal{L}(y) \cap \mathcal{L}(z)$,
  then 1. $\mathcal{L}(z) \rightarrow \mathcal{L}(z) \cup \mathcal{L}(y)$ and
       2. if $z$ is an ancestor of $x$
          then $\mathcal{L}(\langle z, x\rangle) \rightarrow \mathcal{L}(\langle z, x\rangle) \cup \mathsf{Inv}(\mathcal{L}(\langle x, y\rangle))$
          else $\mathcal{L}(\langle x, z\rangle) \rightarrow \mathcal{L}(\langle x, z\rangle) \cup \mathcal{L}(\langle x, y\rangle)$
       3. $\mathcal{L}(\langle x, y\rangle) \rightarrow \emptyset$
       4. Set $u \not\doteq z$ for all $u$ with $u \not\doteq y$

$\leq_r$-rule:
  if   1. $\leq n S.C \in \mathcal{L}(x)$, and
       2. $\sharp S^{\mathcal{F}}(x, C) > n$ and there are two $S$-neighbours $y$, $z$ of $x$
          which are both root nodes, $C \in \mathcal{L}(y) \cap \mathcal{L}(z)$, and not $y \not\doteq z$
  then 1. $\mathcal{L}(z) \rightarrow \mathcal{L}(z) \cup \mathcal{L}(y)$ and
       2. For all edges $\langle y, w\rangle$:
          i. if the edge $\langle z, w\rangle$ does not exist, create it with $\mathcal{L}(\langle z, w\rangle) := \emptyset$
          ii. $\mathcal{L}(\langle z, w\rangle) \rightarrow \mathcal{L}(\langle z, w\rangle) \cup \mathcal{L}(\langle y, w\rangle)$
       3. For all edges $\langle w, y\rangle$:
          i. if the edge $\langle w, z\rangle$ does not exist, create it with $\mathcal{L}(\langle w, z\rangle) := \emptyset$
          ii. $\mathcal{L}(\langle w, z\rangle) \rightarrow \mathcal{L}(\langle w, z\rangle) \cup \mathcal{L}(\langle w, y\rangle)$
       4. Set $\mathcal{L}(y) := \emptyset$ and remove all edges to/from $y$.
       5. Set $u \not\doteq z$ for all $u$ with $u \not\doteq y$ and set $y \doteq z$.

Figure 1: The Expansion Rules for SHIQ-Aboxes.

[Figure 2: Effect of the $\leq$- and the $\leq_r$-rule. The diagram is not recoverable from the extracted text.]

Lemma 3.4 Let $\mathcal{A}$ be a SHIQ-Abox and $\mathcal{R}$ a role hierarchy. The completion algorithm
terminates when started for $\mathcal{A}$ and $\mathcal{R}$.

Proof: Let $m = \sharp\mathsf{clos}(\mathcal{A})$, $n = |\mathbf{R}_{\mathcal{A}}|$, and $n_{\max} := \max\{n \mid\ \geq n R.C \in \mathsf{clos}(\mathcal{A})\}$.
Termination is a consequence of the following properties of the expansion rules:

1. The expansion rules never remove nodes from the forest. The only rules that
remove elements from the labels of edges or nodes are the $\leq$- and $\leq_r$-rule, which
set them to $\emptyset$. If an edge label is set to $\emptyset$ by the $\leq$-rule, the node below this edge
is blocked and will remain blocked forever. The $\leq_r$-rule only sets the label of a
root node $x$ to $\emptyset$, and after this, $x$'s label is never changed again since all edges
to/from $x$ are removed. Since no root nodes are generated, this removal may
only happen a finite number of times, and the new edges generated by the $\leq_r$-rule
guarantee that the resulting structure is still a completion forest.

2. Nodes are labelled with subsets of $\mathsf{clos}(\mathcal{A})$ and edges with subsets of $\mathbf{R}_{\mathcal{A}}$, so
there are at most $2^{2mn}$ different possible labellings for a pair of nodes and an
edge. Therefore, if a path $p$ is of length at least $2^{2mn}$, the pair-wise blocking
condition implies the existence of two nodes $x$, $y$ on $p$ such that $y$ directly blocks
$x$. Since a path on which nodes are blocked cannot become longer, paths are of
length at most $2^{2mn}$.

3. Only the $\exists$- or the $\geq$-rule generate new nodes, and each generation is triggered
by a concept of the form $\exists R.C$ or $\geq n R.C$ in $\mathsf{clos}(\mathcal{A})$. Each of these concepts
triggers the generation of at most $n_{\max}$ successors $y_i$: note that if the $\leq$- or
the $\leq_r$-rule subsequently causes $\mathcal{L}(\langle x, y_i\rangle)$ to be changed to $\emptyset$, then $x$ will have
some $R$-neighbour $z$ with $\mathcal{L}(z) \supseteq \mathcal{L}(y_i)$. This, together with the definition of a
clash, implies that the rule application which led to the generation of $y_i$ will not
be repeated. Since $\mathsf{clos}(\mathcal{A})$ contains a total of at most $m$ concepts of the form $\exists R.C$, the out-degree
of the forest is bounded by $m\, n_{\max}\, n$. ∎

Lemma 3.5 Let $\mathcal{A}$ be a SHIQ-Abox and $\mathcal{R}$ a role hierarchy. If the expansion rules
can be applied to $\mathcal{A}$ and $\mathcal{R}$ such that they yield a complete and clash-free completion
forest, then $\mathcal{A}$ has a tableau w.r.t. $\mathcal{R}$.

Proof: Let $\mathcal{F}$ be a complete and clash-free completion forest. The definition of a
tableau $T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ from $\mathcal{F}$ works as follows. Intuitively, an individual in $\mathbf{S}$
corresponds to a path in $\mathcal{F}$ from some root node to some node that is not blocked, and
which goes only via non-root nodes.

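More precisely, a path is a sequence of pairs of nodes of $\mathcal{F}$ of the form
$p = [\frac{x_0}{x_0'}, \ldots, \frac{x_n}{x_n'}]$. For such a path we define $\mathsf{Tail}(p) := x_n$ and $\mathsf{Tail}'(p) := x_n'$. With
$[p \mid \frac{x_{n+1}}{x_{n+1}'}]$ we denote the path $[\frac{x_0}{x_0'}, \ldots, \frac{x_n}{x_n'}, \frac{x_{n+1}}{x_{n+1}'}]$. The set $\mathsf{Paths}(\mathcal{F})$ is defined
inductively as follows:

• For root nodes $x_0^i$ of $\mathcal{F}$, $[\frac{x_0^i}{x_0^i}] \in \mathsf{Paths}(\mathcal{F})$, and

• For a path $p \in \mathsf{Paths}(\mathcal{F})$ and a node $z$ in $\mathcal{F}$:

- if $z$ is a successor of $\mathsf{Tail}(p)$ and $z$ is neither blocked nor a root node, then
$[p \mid \frac{z}{z}] \in \mathsf{Paths}(\mathcal{F})$, or

- if, for some node $y$ in $\mathcal{F}$, $y$ is a successor of $\mathsf{Tail}(p)$ and $z$ blocks $y$, then
$[p \mid \frac{z}{y}] \in \mathsf{Paths}(\mathcal{F})$.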

Please note that, since root nodes are never blocked, nor are they blocking other nodes,
the only place where they occur in a path is in the first place. Moreover, if
$p \in \mathsf{Paths}(\mathcal{F})$, then $\mathsf{Tail}(p)$ is not blocked, $\mathsf{Tail}(p) = \mathsf{Tail}'(p)$ iff $\mathsf{Tail}'(p)$ is not blocked,
and $\mathcal{L}(\mathsf{Tail}(p)) = \mathcal{L}(\mathsf{Tail}'(p))$.

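We define a tableau $T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{J})$ as follows:

$\mathbf{S} = \mathsf{Paths}(\mathcal{F})$
$\mathcal{L}(p) = \mathcal{L}(\mathsf{Tail}(p))$
$\mathcal{E}(R) = \{\langle p, [p \mid \frac{x}{x'}]\rangle \in \mathbf{S} \times \mathbf{S} \mid x' \text{ is an } R\text{-successor of } \mathsf{Tail}(p)\}\ \cup$
    $\{\langle [q \mid \frac{x}{x'}], q\rangle \in \mathbf{S} \times \mathbf{S} \mid x' \text{ is an } \mathsf{Inv}(R)\text{-successor of } \mathsf{Tail}(q)\}\ \cup$
    $\{\langle [\frac{x}{x}], [\frac{y}{y}]\rangle \in \mathbf{S} \times \mathbf{S} \mid x, y \text{ are root nodes, and } y \text{ is an } R\text{-neighbour of } x\}$
$\mathcal{J}(a_i) = [\frac{x_0^i}{x_0^i}]$ if $x_0^i$ is a root node in $\mathcal{F}$ with $\mathcal{L}(x_0^i) \neq \emptyset$, and
$\mathcal{J}(a_i) = [\frac{x_0^j}{x_0^j}]$ if $\mathcal{L}(x_0^i) = \emptyset$, $x_0^j$ a root node in $\mathcal{F}$ with $\mathcal{L}(x_0^j) \neq \emptyset$ and $x_0^i \doteq x_0^j$

Please note that $\mathcal{L}(x) = \emptyset$ implies that $x$ is a root node and that there is another root
node $y$ with $\mathcal{L}(y) \neq \emptyset$ and $x \doteq y$. We show that $T$ is a tableau for $\mathcal{A}$ w.r.t. $\mathcal{R}$.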

• $T$ satisfies (P1) because $\mathcal{F}$ is clash-free.

• (P2) and (P3) are satisfied by $T$ because $\mathcal{F}$ is complete.

• For (P4), let $p, q \in \mathbf{S}$ with $\forall R.C \in \mathcal{L}(p)$, $\langle p, q\rangle \in \mathcal{E}(R)$. If $q = [p \mid \frac{x}{x'}]$, then
$x'$ is an $R$-successor of $\mathsf{Tail}(p)$ and, due to completeness of $\mathcal{F}$, $C \in \mathcal{L}(x') =
\mathcal{L}(x) = \mathcal{L}(q)$. If $p = [q \mid \frac{x}{x'}]$, then $x'$ is an $\mathsf{Inv}(R)$-successor of $\mathsf{Tail}(q)$ and, due
to completeness of $\mathcal{F}$, $C \in \mathcal{L}(\mathsf{Tail}(q)) = \mathcal{L}(q)$. If $p = [\frac{x}{x}]$ and $q = [\frac{y}{y}]$ for two
root nodes $x$, $y$, then $y$ is an $R$-neighbour of $x$, and completeness of $\mathcal{F}$ yields
$C \in \mathcal{L}(y) = \mathcal{L}(q)$. (P6) and (P11) hold for similar reasons.

• For (P5), let $\exists R.C \in \mathcal{L}(p)$ and $\mathsf{Tail}(p) = x$. Since $x$ is not blocked and $\mathcal{F}$
complete, $x$ has some $R$-neighbour $y$ with $C \in \mathcal{L}(y)$.

- If $y$ is a successor of $x$, then $y$ can either be a root node or not.

* If $y$ is not a root node: if $y$ is not blocked, then $q := [p \mid \frac{y}{y}] \in \mathbf{S}$; if $y$ is
blocked by some node $z$, then $q := [p \mid \frac{z}{y}] \in \mathbf{S}$.

* If $y$ is a root node: since $y$ is a successor of $x$, $x$ is also a root node.
This implies $p = [\frac{x}{x}]$ and $q = [\frac{y}{y}] \in \mathbf{S}$.

- If $x$ is an $\mathsf{Inv}(R)$-successor of $y$, then either

* $p = [q \mid \frac{x}{x'}]$ with $\mathsf{Tail}(q) = y$.

* $p = [q \mid \frac{x}{x'}]$ with $\mathsf{Tail}(q) = u \neq y$. Since $x$ only has one predecessor,
$u$ is not the predecessor of $x$. This implies $x \neq x'$, $x$ blocks $x'$, and
$u$ is the predecessor of $x'$ due to the construction of $\mathsf{Paths}$. Together
with the definition of the blocking condition, this implies $\mathcal{L}(\langle u, x'\rangle) =
\mathcal{L}(\langle y, x\rangle)$ as well as $\mathcal{L}(u) = \mathcal{L}(y)$.

* $p = [\frac{x}{x}]$ with $x$ being a root node. Hence $y$ is also a root node and
$q = [\frac{y}{y}]$.

In any of these cases, $\langle p, q\rangle \in \mathcal{E}(R)$ and $C \in \mathcal{L}(q)$.

• (P7) holds because of the symmetric definition of the mapping $\mathcal{E}$.

• (P8) is due to the definition of $R$-neighbours and $R$-successors.

• Suppose (P9) were not satisfied. Hence there is some $p \in \mathbf{S}$ with $(\leq n S.C) \in
\mathcal{L}(p)$ and $\sharp S^{T}(p, C) > n$. We will show that this implies $\sharp S^{\mathcal{F}}(\mathsf{Tail}(p), C) > n$,
contradicting either clash-freeness or completeness of $\mathcal{F}$. Let $x := \mathsf{Tail}(p)$ and
$P := S^{T}(p, C)$. We distinguish two cases:

- $P$ contains only paths of the form $[p \mid \frac{y}{y'}]$ and $[\frac{x_0^i}{x_0^i}]$. Then $\sharp P > n$ is
impossible since the function $\mathsf{Tail}'$ is injective on $P$: if we assume that there
are two distinct paths $q_1, q_2 \in P$ with $\mathsf{Tail}'(q_1) = \mathsf{Tail}'(q_2) = y'$, then this
implies that each $q_i$ is of the form $q_i = [p \mid \frac{y_i}{y'}]$ or $q_i = [\frac{y'}{y'}]$. From $q_1 \neq q_2$,
we have that $q_i = [p \mid \frac{y_i}{y'}]$ holds for some $i \in \{1, 2\}$. Since root nodes occur
only at the beginning of paths and $q_1 \neq q_2$, we have $q_1 = [p \mid \frac{y_1}{y'}]$ and
$q_2 = [p \mid \frac{y_2}{y'}]$. If $y'$ is not blocked, then $y_1 = y' = y_2$, contradicting
$q_1 \neq q_2$. If $y'$ is blocked in $\mathcal{F}$, then both $y_1$ and $y_2$ block $y'$, which
implies $y_1 = y_2$, again a contradiction. Hence $\mathsf{Tail}'$ is injective on $P$ and thus
$\sharp P = \sharp \mathsf{Tail}'(P)$. Moreover, for each $y' \in \mathsf{Tail}'(P)$, $y'$ is an $S$-successor of
$x$ and $C \in \mathcal{L}(y')$. This implies $\sharp S^{\mathcal{F}}(x, C) > n$.

- $P$ contains a path $q$ where $p = [q \mid \frac{x}{x'}]$. Obviously, $P$ may only contain one
such path. As in the previous case, $\mathsf{Tail}'$ is an injective function on the set
$P' := P \setminus \{q\}$, each $y' \in \mathsf{Tail}'(P')$ is an $S$-successor of $x$, and $C \in \mathcal{L}(y')$
for each $y' \in \mathsf{Tail}'(P')$. Let $z := \mathsf{Tail}(q)$. We distinguish two cases:

* $x = x'$. Hence $x$ is not blocked, and thus $x$ is an $\mathsf{Inv}(S)$-successor
of $z$. Since $\mathsf{Tail}'(P')$ contains only successors of $x$, we have that $z \notin
\mathsf{Tail}'(P')$ and, by construction, $z$ is an $S$-neighbour of $x$ with $C \in
\mathcal{L}(z)$.

* $x \neq x'$. This implies that $x'$ is blocked by $x$ and that $x'$ is an $\mathsf{Inv}(S)$-
successor of $z$. Due to the definition of pairwise-blocking this implies
that $x$ is an $\mathsf{Inv}(S)$-successor of some node $u$ with $\mathcal{L}(u) = \mathcal{L}(z)$.
Again, $u \notin \mathsf{Tail}'(P')$ and, by construction, $u$ is an $S$-neighbour of $x$
and $C \in \mathcal{L}(u)$.

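• For (P10), let $(\geq n S.C) \in \mathcal{L}(p)$. Hence there are $n$ $S$-neighbours $y_1, \ldots, y_n$ of
$x = \mathsf{Tail}(p)$ in $\mathcal{F}$ with $C \in \mathcal{L}(y_i)$. For each $y_i$ there are three possibilities:

- $y_i$ is an $S$-successor of $x$ and $y_i$ is not blocked in $\mathcal{F}$. Then $q_i := [p \mid \frac{y_i}{y_i}]$ is in $\mathbf{S}$, or
$y_i$ is a root node and $q_i := [\frac{y_i}{y_i}]$ is in $\mathbf{S}$.

- $y_i$ is an $S$-successor of $x$ and $y_i$ is blocked in $\mathcal{F}$ by some node $z$. Then
$q_i = [p \mid \frac{z}{y_i}]$ is in $\mathbf{S}$. Since the same $z$ may block several of the $y_j$, it is
indeed necessary to include $y_i$ explicitly in the path to make them distinct.

- $x$ is an $\mathsf{Inv}(S)$-successor of $y_i$. There may be at most one such $y_i$ if $x$ is not
a root node. Hence either $p = [q_i \mid \frac{x}{x'}]$ with $\mathsf{Tail}(q_i) = y_i$, or $p = [\frac{x}{x}]$ and
$q_i = [\frac{y_i}{y_i}]$.

Hence for each $y_i$ there is a different path $q_i$ in $\mathbf{S}$ with $\langle p, q_i \rangle \in \mathcal{E}(S)$ and
$C \in \mathcal{L}(q_i)$, and thus $\sharp S^{T}(p, C) \geq n$.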

• (P12) is due to the fact that, when the completion algorithm is started for an
Abox $\mathcal{A}$, the initial completion forest $\mathcal{F}_{\mathcal{A}}$ contains, for each individual name $a_i$
occurring in $\mathcal{A}$, a root node $x_0^i$ with $\mathcal{L}(x_0^i) = \{C \in \mathsf{clos}(\mathcal{A}) \mid a_i : C \in \mathcal{A}\}$.
The algorithm never blocks root individuals, and, for each root node $x_0^i$ whose
label and edges are removed by the $\leq_r$-rule, there is another root node $x_0^j$ with
$x_0^i \doteq x_0^j$ and $\{C \in \mathsf{clos}(\mathcal{A}) \mid a_i : C \in \mathcal{A}\} \subseteq \mathcal{L}(x_0^j)$. Together with the definition
of $\mathcal{I}$, this yields (P12). (P13) is satisfied for similar reasons.

• (P14) is satisfied because the $\leq_r$-rule does not identify two root nodes $x_0^i, y_0^i$
when $x_0^i \not\doteq y_0^i$ holds. ■
Lemma 3.6 Let $\mathcal{A}$ be a $\mathcal{SHIQ}$-Abox and $\mathcal{R}$ a role hierarchy. If $\mathcal{A}$ has a tableau w.r.t.
$\mathcal{R}$, then the expansion rules can be applied to $\mathcal{A}$ and $\mathcal{R}$ such that they yield a complete
and clash-free completion forest.



Proof: Let $T = (\mathbf{S}, \mathcal{L}, \mathcal{E}, \mathcal{I})$ be a tableau for $\mathcal{A}$ and $\mathcal{R}$. We use $T$ to trigger the
application of the expansion rules such that they yield a completion forest $\mathcal{F}$ that is
both complete and clash-free. To this purpose, a function $\pi$ is used which maps the
nodes of $\mathcal{F}$ to elements of $\mathbf{S}$. The mapping $\pi$ is defined as follows:

• For individuals $a_i$ in $\mathcal{A}$, we define $\pi(x_0^i) := \mathcal{I}(a_i)$.

• If $\pi(x) = s$ is already defined, and a successor $y$ of $x$ was generated for $\exists R.C \in
\mathcal{L}(x)$, then $\pi(y) = t$ for some $t \in \mathbf{S}$ with $C \in \mathcal{L}(t)$ and $\langle s, t \rangle \in \mathcal{E}(R)$.
• If $\pi(x) = s$ is already defined, and successors $y_i$ of $x$ were generated for
$\geq n R.C \in \mathcal{L}(x)$, then $\pi(y_i) = t_i$ for $n$ distinct $t_i \in \mathbf{S}$ with $C \in \mathcal{L}(t_i)$ and
$\langle s, t_i \rangle \in \mathcal{E}(R)$.

Obviously, the mapping for the initial completion forest for $\mathcal{A}$ and $\mathcal{R}$ satisfies the
following conditions:

$$
\left.
\begin{array}{l}
\mathcal{L}(x) \subseteq \mathcal{L}(\pi(x)), \\
\text{if } y \text{ is an } S\text{-neighbour of } x, \text{ then } \langle \pi(x), \pi(y) \rangle \in \mathcal{E}(S), \text{ and} \\
x \not\doteq y \text{ implies } \pi(x) \neq \pi(y).
\end{array}
\right\} \quad (*)
$$

It can be shown that the following claim holds:

CLAIM: Let $\mathcal{F}$ be generated by the completion algorithm for $\mathcal{A}$ and $\mathcal{R}$ and let $\pi$ satisfy
(*). If an expansion rule is applicable to $\mathcal{F}$, then this rule can be applied such that it
yields a completion forest $\mathcal{F}'$ and a (possibly extended) $\pi$ that satisfy (*).

As a consequence of this claim, (P1), and (P9), if $\mathcal{A}$ and $\mathcal{R}$ have a tableau, then
the expansion rules can be applied to $\mathcal{A}$ and $\mathcal{R}$ such that they yield a complete and
clash-free completion forest. ■

From Theorem 2.4 and Lemmas 3.2, 3.4, 3.5, and 3.6, we thus have the following
theorem:

Theorem 3.7

The completion algorithm is a decision procedure for the consistency of $\mathcal{SHIQ}$-Aboxes
and the satisfiability and subsumption of concepts with respect to role hierarchies and
terminologies.

4 Conclusion

We have presented an algorithm for deciding the satisfiability of $\mathcal{SHIQ}$ KBs where the
Abox may be non-empty and where the uniqueness of individual names is not assumed
but can be asserted in the Abox. This algorithm is of particular interest as it can be
used to decide the problem of conjunctive query containment w.r.t. a schema [17].
An implementation of the $\mathcal{SHIQ}$ Tbox satisfiability algorithm is already available
in the FaCT system [14], and is able to reason efficiently with Tboxes derived from
realistic ER schemas. This suggests that the algorithm presented here could form the
basis of a practical decision procedure for the query containment problem. Work is
already underway to test this conjecture by extending the FaCT system with an
implementation of the new algorithm.